Enabling Precise Interpretations of Software Change Data

Author

  • David Kawrykow
Abstract

Numerous techniques mine change data captured in software archives to assist software engineering efforts. These change-based approaches typically analyze change sets – groups of co-committed changes – under the assumption that the development work represented by change sets is both meaningful and related to a single change task. However, we have found that change sets often violate this assumption by containing changes that we consider to be non-essential, or less likely to be representative of the kind of meaningful software development effort that is most interesting to typical change-based approaches. Furthermore, we have found many change sets addressing multiple subtasks – groups of isolated changes that are related to each other, but not to other changes within a change set. Information mined from such change sets has the potential for interfering with the analyses of various change-based approaches. We propose a catalog of non-essential changes and describe an automated technique for detecting such changes within version histories. We used our technique to conduct an empirical investigation of over 30 000 change sets capturing over 25 years of cumulative development activity in ten open-source Java systems. Our investigation found that between 3% and 26% of all modified code lines and between 2% and 16% of all method updates consisted entirely of non-essential modifications. We further found that eliminating such modifications reduces the number of false-positive recommendations that would be made by an existing association rule miner. These findings are supported by a manual evaluation of our detection technique, in which we found that our technique falsely identifies non-essential method updates in only 0.2% of all cases. These observations should be kept in mind when interpreting insights derived from version repositories. We also propose a formal definition of “subtasks” and present an automated technique

Similar resources

Computation of evolutionary change

A key issue in software evolution analysis is being able to compute evolutionary change accurately and with rich semantics. This dissertation describes a mathematical framework for enabling accurate computation of semantic evolutionary change. It is based on graphs for representing software semantics, graph transformations for modeling evolution, and effects of graph transformations for capturi...

How many trimers? Modeling influenza virus fusion yields a minimum aggregate size of six trimers, three of which are fusogenic.

Conflicting reports in leading journals have indicated the minimum number of influenza hemagglutinin (HA) trimers required for fusion to be between one and eight. Interestingly, the data in these reports are either almost identical, or can be transformed to be directly comparable. Different statistical or phenomenological models, however, were used to analyze these data, resulting in the varied...

The Impacts of Climate Change on Agriculture Hormozgan Province and Compatibility With it (Cucumber and Watermelon product)

Introduction and Background: The agricultural sector is one of the most important economic sectors and, because of its extensive interaction with the environment, it bears the greatest impact of climate change. The agricultural sector both affects and is affected by climate change. Climate change, on the one hand, affects agricultural performance and, on the other hand, affects the price of produ...

The Embedded Health Management Academic: A Boundary Spanning Role for Enabling Knowledge Translation; Comment on “CIHR Health System Impact Fellows: Reflections on ‘Driving Change’ Within the Health System”

Healthcare organisations are looking at strategies and activities to improve patient outcomes, beyond clinical interventions. Increasingly, health organisations are investing significant resources in leadership, management and team work training to optimise professional collaboration, shared decision-making and, by extension, high quality services. Embedded clinical aca...

Publication date: 2011